A Multi-Task Scheme for Supervised DNN-Based Single-Channel Speech Enhancement by Using Speech Presence Probability as the Secondary Training Target
Authors
Abstract
To cope with complicated interference scenarios in realistic acoustic environments, supervised deep neural networks (DNNs) have been investigated to estimate different user-defined targets. Such techniques can be broadly categorized into magnitude estimation and time-frequency mask estimation techniques. Further, a mask such as the Wiener gain can be estimated directly or derived from an estimated interference power spectral density (PSD) or an estimated signal-to-interference ratio (SIR). In this paper, we propose to incorporate multi-task learning into DNN-based single-channel speech enhancement by using the speech presence probability (SPP) as a secondary target to assist the main task. Domain-specific information is shared between the two tasks to learn a more generalizable representation. Since the performance of a multi-task network is sensitive to the weight parameters of the loss function, homoscedastic uncertainty is introduced to adaptively learn the weights, which is shown to outperform the fixed weighting method. Simulation results show that the proposed multi-task scheme improves overall speech enhancement performance compared with conventional single-task methods. The joint direct mask and SPP estimation yields the best performance among all considered techniques.
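The abstract does not give the exact form of the uncertainty-based weighting, so the following is only a minimal sketch of one commonly used formulation, in which each task loss L_i is scaled by exp(-s_i) and regularized by s_i, with s_i a learnable log-variance. The function names, the 0.5 factors, the fixed example losses, and the plain gradient-descent loop are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses with learnable log-variances s_i (assumed form).

    total = sum_i 0.5 * exp(-s_i) * L_i + 0.5 * s_i
    """
    task_losses = np.asarray(task_losses, dtype=float)
    log_vars = np.asarray(log_vars, dtype=float)
    precision = np.exp(-log_vars)  # 1 / sigma_i^2
    return float(np.sum(0.5 * precision * task_losses + 0.5 * log_vars))


def log_var_gradient(task_losses, log_vars):
    """Analytic gradient of the combined loss with respect to each s_i."""
    task_losses = np.asarray(task_losses, dtype=float)
    log_vars = np.asarray(log_vars, dtype=float)
    return -0.5 * np.exp(-log_vars) * task_losses + 0.5


def fit_log_vars(task_losses, steps=2000, lr=0.1):
    """Gradient descent on s_i while the task losses are held fixed.

    In real training the network weights and s_i are updated jointly;
    holding the losses fixed only illustrates where the weights settle.
    """
    log_vars = np.zeros(len(task_losses))
    for _ in range(steps):
        log_vars -= lr * log_var_gradient(task_losses, log_vars)
    return log_vars


# Hypothetical loss values: main task (mask estimation), secondary task (SPP).
example_losses = [0.8, 0.05]
learned_log_vars = fit_log_vars(example_losses)
learned_weights = 0.5 * np.exp(-learned_log_vars)
```

Under this assumed form, each s_i settles at log(L_i) when the losses are held fixed, so a task with a smaller loss receives a larger weight automatically instead of relying on a hand-tuned constant.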
Related papers
Speech Enhancement using Laplacian Mixture Model under Signal Presence Uncertainty
In this paper, an estimator for speech enhancement based on a Laplacian mixture model is proposed. The proposed method estimates the complex DFT coefficients of clean speech from noisy speech using the MMSE estimator, where the clean speech DFT coefficients are assumed to follow a mixture of Laplacians and the noise DFT coefficients are assumed to follow a zero-mean Gaussian distribution. Furthermore, the MMS...
Single-Channel Speech Enhancement Using Double Spectrum
Single-channel speech enhancement is often formulated in the Short-Time Fourier Transform (STFT) domain. As an alternative, several previous studies have reported advantages of speech processing using pitch-synchronous analysis and filtering in the modulation transform domain. We propose to use the Double Spectrum (DS) obtained by combining pitch-synchronous transform followed by modulation tran...
Single Channel Speech Enhancement Using Outlier Detection
Distortion of the underlying speech is a common problem for single-channel speech enhancement algorithms, and hinders such methods from being used more extensively. A dictionary-based speech enhancement method that emphasizes preserving the underlying speech is proposed. Spectral patches of clean speech are sampled and clustered to train a dictionary. Given a noisy speech spectral patch, the be...
Single-channel dynamic exemplar-based speech enhancement
This paper proposes an exemplar-based speech enhancement method based on high-resolution STFT magnitude spectrograms, where a selection of the nonnegative training data is used as the dictionary to provide a holistic nonnegative representation of the test data. We discuss how this exemplar-based model ensures that the enhanced speech signal falls on the speech manifold, which improves the quali...
A multi-channel speech enhancement framework for robust NMF-based speech recognition for speech-impaired users
In this paper a multi-channel speech enhancement framework for distant speech acquisition in noisy and reverberant environments for Non-negative Matrix Factorization (NMF)-based Automatic Speech Recognition (ASR) is proposed. The system is evaluated for its use in an assistive vocal interface for physically impaired and speech-impaired users. The framework utilises the Spatially Pre-processed S...
Journal
Journal title: IEICE Transactions on Information and Systems
Year: 2021
ISSN: 0916-8532, 1745-1361
DOI: https://doi.org/10.1587/transinf.2020edp7267